# Audio to Text

Ultravox v0.5 Llama 3.2 1B GGUF

Ultravox v0.5 is an audio-to-text model built on the Llama 3.2 1B architecture and focused on efficient speech transcription.

Speech Recognition

Gemma 3 4B IT Q4_0

Gemma 3 4B Instruct is a 4-billion-parameter large language model developed by Google, focusing on text generation and comprehension tasks.

Large Language Model

Speechless Llama3.2 V0.1 I1 GGUF

Weighted (importance-matrix) quantizations of the Menlo/Speechless-llama3.2-v0.1 model, available in multiple quantization variants.

Large Language Model, Supports Multiple Languages

Whisper Large V3.w4a16

A quantized version of openai/whisper-large-v3 that uses INT4 weights with FP16 activations (W4A16), suitable for inference with vLLM.

Speech Recognition

Transformers, English

Wav2vec2 300m Teste4

A speech recognition model based on facebook/wav2vec2-xls-r-300m and fine-tuned on the common_voice dataset.

Speech Recognition

© 2025 AIbase